Identification of functional information subgraphs in complex networks 
We present a general information theoretic approach for identifying functional subgraphs in complex
networks where the dynamics of each node are observable. We show that the uncertainty in the
state of each node can be expressed as a sum of information quantities involving a growing number 
of correlated variables at other nodes. We demonstrate that each term in this sum is generated by 
successively conditioning mutual informations on new measured variables, in a way analogous to a 
discrete differential calculus. The analogy to a Taylor series suggests efficient search algorithms for 
determining the state of a target variable in terms of functional groups of other degrees of freedom. 
We apply this methodology to electrophysiological recordings of networks of cortical neurons grown 
in vitro. Despite strong stochasticity, we show that each cell's patterns of firing are generally explained
by the activity of a small number of other neurons. We identify these neuronal subgraphs
in terms of their mutually redundant or synergetic character and reconstruct neuronal circuits that 
account for the state of each target cell. 



Information plays a central role in conditioning struc- 
ture and determining collective dynamics in many com- 
plex systems. For example, the ability to process and re- 
act to information certainly influences how neurons and 
synapses, or genes and proteins, interact in large num- 
bers to generate the complexity of cognitive and biolog- 
ical processes. Despite their importance, however, sys- 
tematic methodologies for identifying functional relations 
between units of successive complexity, involved in infor- 
mation processing and storage, are still largely missing. 

Motivated by recent theoretical developments and experimental
breakthroughs, new interest has arisen in applications
of information theory to dynamical and statistical
systems with many degrees of freedom [1, 2, 3].
Specifically, it has been shown that information quantities
can identify and classify spatial [4] and temporal [5]
correlations, and reveal if a group of variables may be 
mutually redundant or synergetic 0, Q • In this way an 
information theoretic treatment of groups of correlated 
degrees of freedom can reveal their functional roles in 
terms of arrangements that can serve as memory struc- 
tures or those capable of processing information. 

The application of these insights to identify func- 
tional connectivity structure is still just beginning ^] but 
should provide a useful complement to other established 
approaches [1, [13, [lH by directly relating observable 
dynamics or statistics to information structures. To date, 
the identification of functional relations between nodes of 
a complex network has relied on the statistics of motifs. 
These are specific (directed) subgraphs of k nodes that 
appear more abundantly than expected in randomized 
networks with the same number of nodes and degree of 
connectivity [1, [3, [13] . Although powerful for small sub- 
graphs, this approach scales up poorly since the number 
of different subgraphs explodes combinatorially with in- 
creasing number of nodes k. Consequently, the extensive 
searches that are necessary for measuring motif frequencies
become prohibitive for about $k > 5$. A general
solution to this curse of dimensionality is to perform targeted
searches guided by quantitative expectations for
finding the most informative node combinations relative 
to an external signal or to other parts of the system. 
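
To make the combinatorial growth concrete, the following minimal sketch counts the node sets of size k that an exhaustive search would have to visit. It assumes n = 62 nodes (the number of recorded cells quoted later in the text); the function name `count_node_sets` is illustrative only. Counting node sets gives a lower bound on the search space, since each set of k nodes admits many distinct directed subgraphs.

```python
from math import comb


def count_node_sets(n, k):
    """Number of distinct k-node subsets that can be drawn from n nodes."""
    return comb(n, k)


n = 62  # assumed network size (recorded cells quoted later in the text)
for k in range(2, 9):
    print(f"k = {k}: {count_node_sets(n, k):,} node sets")
```

Already at k = 5 the count exceeds six million sets for this assumed network size, which is why a targeted search is attractive.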

Here we present such an approach, based on the rigor- 
ous properties of information theory applied to the cor- 
related statistical state of many variables. We show how 
the uncertainty in the state of any target variable, quan- 
tified by its Shannon entropy, can be expressed in terms 
of a cluster expansion of information quantities involv- 
ing a successively larger number of variables. The sign 
and magnitude of each term in the expansion determines 
the functional connectivity among nodes to that order; 
specifically whether a set of k nodes is functionally inde- 
pendent, redundant, or synergetic. Because the Shannon 
entropy is positive definite, this expansion gives a sys- 
tematic approximation to the state of the target. As a 
result the expansion can be truncated at any order and 
used to construct approximate non-exhaustive search al- 
gorithms, analogous to gradient methods in other opti- 
mization problems. We demonstrate the efficacy of this 
method through its application to spike time series of 
cortical neuronal networks grown in vitro. 

Information is a relative quantity, quantifying the in- 
crease in predictability (reduction in uncertainty) of a 
variable's statistical state given knowledge of others with 
which it is correlated. Specifically, the uncertainty in 
the state of $X$ can be quantified by its Shannon entropy [12],
$S(X) = -\sum_x p(x) \log_2 p(x)$, where $p(x)$ are
the marginals for each state $x$ of $X$. Note that $S(X) \ge 0$,
where $S(X) = 0$ corresponds to precise knowledge of $X$
and the probability distribution $p(x) = 1$ for some state
$x$. Measuring variables $Y_i$ correlated with $X$ contributes to
knowledge of its state and reduces its uncertainty, thus 

$$S(X) \ge S(X|\{Y\}_{k-1}) \ge S(X|\{Y\}_k), \qquad (1)$$

with $k \le n$ for $n$ total variables and where $S(X|Y)$ refers
to the conditional entropy of $X$ given $Y$ [12]. We use
the notation $\{Y\}_k$ to refer to the set $Y_1, \ldots, Y_k$. The
difference between the entropy of $X$ and its entropy given
the joint state of a set $\{Y\}_k$ is the information in the set:

$$I(X; \{Y\}_k) = S(X) - S(X|\{Y\}_k) \ge I(X; \{Y\}_{k-1}). \qquad (2)$$
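
As a minimal sketch of how the quantities in Eqs. (1) and (2) can be estimated from data, the block below computes plug-in (frequency-based) entropies from sequences of observed states. The helper names `entropy`, `conditional_entropy`, and `information` are assumptions made for illustration, and the toy data (the truth table of an AND gate) is not taken from the experiments.

```python
from collections import Counter
from math import log2


def entropy(samples):
    """Plug-in Shannon entropy (bits) of a sequence of hashable states."""
    counts = Counter(samples)
    n = len(samples)
    return -sum(c / n * log2(c / n) for c in counts.values())


def conditional_entropy(x, ys):
    """S(X | Y_1..Y_k) = S(X, Y_1..Y_k) - S(Y_1..Y_k); ys is a list of sequences."""
    if not ys:
        return entropy(x)
    joint_y = list(zip(*ys))
    joint_xy = list(zip(x, *ys))
    return entropy(joint_xy) - entropy(joint_y)


def information(x, ys):
    """I(X; {Y}_k) = S(X) - S(X | {Y}_k), as in Eq. (2)."""
    return entropy(x) - conditional_entropy(x, ys)


# Toy data: each row of the AND truth table observed once.
y1 = [0, 0, 1, 1]
y2 = [0, 1, 0, 1]
x = [a & b for a, b in zip(y1, y2)]

s0 = entropy(x)                         # S(X)
s1 = conditional_entropy(x, [y1])       # S(X | Y_1)
s2 = conditional_entropy(x, [y1, y2])   # S(X | Y_1, Y_2)
print(s0, s1, s2)
```

In this toy case each additional measurement lowers the remaining entropy, in line with Eq. (1), and the pair of inputs fully determines the target.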

These relations also specify the optimization problem of 
minimizing the uncertainty in X given k measurements 
within a larger (possibly infinite) set. Specifically, 
if a set exists at some order $k$ so that $S(X|\{Y\}_k) = 0$,
and therefore $I(X;\{Y\}_k) = S(X)$, then it fully determines
the state of $X$ and no uncertainty remains. Each
measurement can only reduce the uncertainty in $X$ or leave it unchanged,
while information quantities are symmetric under permutation
of the $Y_i$, so that the maximal entropy reduction
from any given set $\{Y\}_k$ is unique. The challenge
resides in finding the measurement set of size k result- 
ing in the smallest remaining uncertainty. The compu- 
tational complexity of this search grows combinatorially 
with the number of arrangements of size k within n vari- 
ables, which quickly becomes prohibitive. To evade this 
problem, we introduce the exact expansion 



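$$S(X|\{Y\}_k) - S(X) = -I(X;\{Y\}_k) = \sum_i \frac{\Delta S(X)}{\Delta Y_i} + \sum_{i<j} \frac{\Delta^2 S(X)}{\Delta Y_i \Delta Y_j} + \cdots + \frac{\Delta^k S(X)}{\Delta Y_1 \cdots \Delta Y_k}. \qquad (3)$$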

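The variational operators in Eq. (3) define the change in
entropy resulting from a measurement as

$$\frac{\Delta S(X)}{\Delta Y_i} \equiv S(X|Y_i) - S(X) = -I(X;Y_i),$$

$$\frac{\Delta^2 S(X)}{\Delta Y_i \Delta Y_j} \equiv \frac{\Delta}{\Delta Y_j}\left[\frac{\Delta S(X)}{\Delta Y_i}\right] = S(X|Y_i,Y_j) - S(X|Y_i) - S(X|Y_j) + S(X) = I(X;Y_i) - I(X;Y_i|Y_j),$$

and so on. Higher order variations follow automatically
from the successive application of the first variation, resulting
in a simple chain rule. Thus, variations to any
order $k$ are symmetrical under permutations of the $Y_i$.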
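
The following sketch checks numerically that the order-by-order variations sum to the total entropy reduction. It assumes one particular reading of the text: that applying the first variation successively yields an alternating (inclusion-exclusion) sum of conditional entropies over all subsets of the measured variables. The helper names and the toy target (the parity of three binary inputs) are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter
from itertools import combinations, product
from math import log2


def entropy(samples):
    """Plug-in Shannon entropy (bits) of a sequence of hashable states."""
    counts = Counter(samples)
    n = len(samples)
    return -sum(c / n * log2(c / n) for c in counts.values())


def cond_entropy(x, ys):
    """S(X | ys) for a list of sequences ys (empty list gives S(X))."""
    if not ys:
        return entropy(x)
    return entropy(list(zip(x, *ys))) - entropy(list(zip(*ys)))


def variation(x, ys):
    """Variation of order len(ys): alternating sum of S(X | T) over subsets T of ys."""
    k = len(ys)
    total = 0.0
    for r in range(k + 1):
        for subset in combinations(ys, r):
            total += (-1) ** (k - r) * cond_entropy(x, list(subset))
    return total


def expansion_terms(x, ys):
    """Sum of all variations at each order 1..k."""
    return [
        sum(variation(x, list(s)) for s in combinations(ys, order))
        for order in range(1, len(ys) + 1)
    ]


# Toy target: parity (XOR) of three inputs, each input pattern observed once.
rows = list(product([0, 1], repeat=3))
ys = [[row[i] for row in rows] for i in range(3)]
x = [a ^ b ^ c for a, b, c in rows]

terms = expansion_terms(x, ys)
lhs = cond_entropy(x, ys) - entropy(x)  # S(X | {Y}_3) - S(X)
print(terms, lhs)
```

For the parity target the first- and second-order terms vanish and the whole entropy reduction sits in a single negative third-order term, an example of information that only appears when all three inputs are measured together.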

This expansion has two important properties. First,
each term in the expansion at order $k$ accounts for an
irreducible set of correlations among a size-$k$ group of $Y_i$
nodes with the target $X$. Statistical independence among
any of the $Y_i$ results in a vanishing contribution to that
order and terminates the expansion. For example, if all
$Y_i$ are mutually independent, all variations for $k > 1$
vanish identically and the information about $X$ is given
by $\sum_i I(X; Y_i)$, that is, the first order terms in Eq. (3).
If the $Y_i$ are correlated in pairs, but not in higher order
multiplets, then only terms with $k \le 2$ will be present,
and so on. Thus, for a system where not all correlations
are realized, Eq. (3) allows the identification
of correlated submultiplets, and determines their mutual
organization in specifying the state of $X$.
FIG. 1: (a) Neuronal culture over a microelectrode array
(white circles; bar = 100 μm). (b) Detail of a spike time series.
The box shows network state 011101100 (bottom to top).



The second important property of this expansion is 
that the sign of each nonvanishing variation reveals the 
informational character of the corresponding multiplet.
Specifically, a negative sign indicates that the $k$-multiplet
contributes to the state of X with more information than 
the sum of all its subgroups (synergy), while a positive 
sign indicates the opposite (redundancy). We define a 
synergetic (redundant) core as a set {Y}k such that its 
variation and the variations of all its subgroups of two 
or more variables are negative (positive). Explicit ex- 
amples where the Yi are inputs of a logical circuit and 
X is the output (e.g. an AND circuit) confirm that the 
sign of any variation of the Yi identifies synergetic ar- 
rangements to any order. Likewise, arrangements where 
the same information is shared among some of the Yi, 
as in a Markov chain, result in the sign of the variation
indicating redundancy. Examples of these relations
to low orders ($k \le 3$) have been worked out recently
[2, iZl, and their detailed generalization will appear
elsewhere [iSj. We also note that the concept of order- 
by-order synergy or redundancy captured by each of the
terms in Eq. (3) generalizes the coefficient of redundancy
$R(X, \{Y\}_k) = \sum_{i=1}^{k} I(X; Y_i) - I(X; \{Y\}_k)$ proposed by
Schneidman et al. [6], which refers to the global information
deficit (or excess if $R < 0$) of a multiplet, relative
to only the first term in Eq. (3).
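
A small numerical illustration of the sign convention: the sketch below evaluates the coefficient of redundancy defined above (written `R` in the code) for two toy arrangements. The function names and both data sets are assumptions chosen for illustration: an AND gate stands in for a synergetic arrangement, and two perfect copies of the target stand in for a fully redundant one.

```python
from collections import Counter
from math import log2


def entropy(samples):
    """Plug-in Shannon entropy (bits) of a sequence of hashable states."""
    counts = Counter(samples)
    n = len(samples)
    return -sum(c / n * log2(c / n) for c in counts.values())


def information(x, ys):
    """I(X; {Y}) = S(X) - [S(X, {Y}) - S({Y})] for a list of sequences ys."""
    s_cond = entropy(list(zip(x, *ys))) - entropy(list(zip(*ys)))
    return entropy(x) - s_cond


def redundancy_coefficient(x, ys):
    """R = sum_i I(X; Y_i) - I(X; {Y}_k); positive means redundancy, negative synergy."""
    return sum(information(x, [y]) for y in ys) - information(x, ys)


# Synergetic toy case: X is the AND of two independent inputs.
a = [0, 0, 1, 1]
b = [0, 1, 0, 1]
x_and = [u & v for u, v in zip(a, b)]
r_and = redundancy_coefficient(x_and, [a, b])

# Redundant toy case: both measured variables are copies of X.
x_copy = [0, 1]
r_copy = redundancy_coefficient(x_copy, [x_copy, x_copy])

print(r_and, r_copy)
```

The AND gate gives a negative value (the pair carries more information than the two inputs taken separately), while the copies give a positive value (the same bit is counted twice in the first-order sum).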

For the remainder of this Letter, we use the expansion
in Eq. (3) to define the optimization problem of determining
the set and decomposition of the $Y_i$ in terms of
functional information arrangements that best account
for the stochastic behavior of a target $X$. Because the
entropy $S(X|\{Y\}_k) \ge 0$ for all $k$, this approach defines
a well posed optimization problem, with a single global
minimum for each set of possible measurements.

To illustrate this methodology, we apply it to tempo- 
ral action potential activity from murine frontal cortex 
neuronal cultures grown in vitro on non-invasive microelectrode
arrays (MEAs) [3, [S IH. Fig. 1(a) shows
an example network growing on an MEA and Fig. 1(b)
typical time series data. Details of MEA fabrication and 
FIG. 2: Entropy remaining when other neurons are measured
with respect to neuron 46. Rank is determined by maximizing 
the variation to various orders. The neuron numbers appear 
for the exact curve. Inset: Histogram of entropy of each neu- 
ron remaining after all possible measurements. 



culture preparation are described elsewhere 0, [H, HI- 
These experimental platforms have become model sys- 
tems for studying living neuronal networks in controlled 
environments. Recent progress includes studies of dy- 
namical patterns of collective activity H, 1^ 2^ 21. 2a|. 
connectivity structure 0, [l^, network growth and de- 
velopment [2lJ. and even learning and activity pattern 
modification [24. [25l [2^ via external stimulation. Re- 
sults presented here refer to 62 cells of a mature (42 
days in vitro; see [27]) cortical network. To analyze patterns
of neuronal activity, binary states are constructed
[see Fig. 1(b)] for each recorded neuron's time series using
temporal bins of 10 ms; 1 is recorded if a neuron
fires within a bin and 0 otherwise. Probability
distributions for states of k neurons are estimated via 
frequencies and provide the basis for calculating infor- 
mation theoretic quantities. Probabilities are considered 
significant if substantially larger than from a null model 
with randomized spiking at observed rates for each neu- 
ron. Nearly all of the network activity occurs as global 
coordinated spiking events, known as network bursts or
avalanches [11, HJlillli]. 
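
The binning and frequency estimation described above can be sketched as follows. This is a minimal illustration under stated assumptions: spike times are given in milliseconds, the function names are hypothetical, and the null model is taken to be independent spiking at each neuron's observed rate (one simple reading of "randomized spiking at observed rates"). No significance threshold is implemented, since the text does not specify one.

```python
from collections import Counter


def binarize(spike_times_ms, duration_ms, bin_ms=10.0):
    """Return a 0/1 list: 1 if at least one spike falls in the bin, else 0."""
    n_bins = int(duration_ms // bin_ms)
    states = [0] * n_bins
    for t in spike_times_ms:
        b = int(t // bin_ms)
        if 0 <= b < n_bins:
            states[b] = 1
    return states


def state_frequencies(binary_trains):
    """Empirical probability of each joint binary state (one tuple per time bin)."""
    words = list(zip(*binary_trains))
    counts = Counter(words)
    return {w: c / len(words) for w, c in counts.items()}


def independent_null(binary_trains, word):
    """Probability of `word` if neurons fired independently at their observed rates."""
    p = 1.0
    for train, bit in zip(binary_trains, word):
        rate = sum(train) / len(train)
        p *= rate if bit else 1.0 - rate
    return p


# Toy spike times (ms) for two hypothetical neurons over 50 ms.
train_a = binarize([3.0, 12.5, 14.0, 41.0], duration_ms=50.0)
train_b = binarize([4.2, 44.9], duration_ms=50.0)

freqs = state_frequencies([train_a, train_b])
null_11 = independent_null([train_a, train_b], (1, 1))
print(train_a, train_b, freqs, null_11)
```

Comparing `freqs[(1, 1)]` with `null_11` shows how a joint state can occur more often than independent firing would predict.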

Fig. 2 shows the relative entropy reduction of a target
neuron, due to successive measurements of other neurons.
Different lines correspond to searches for the optimal sequence
of measurements at different orders of approximation
in the expansion in Eq. (3). A search to exact
order means that all $I(X; \{Y\}_k)$ are considered, given
the previous $\{Y\}_{k-1}$, and the set $\{Y\}_k$ with greatest information
gain is chosen. Most neurons show an initial
large drop in entropy due to the measurement of only a
few other cells in the network (typically < 5) and a subsequent
slower information gain as more cells are measured.
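
The search to exact order described above can be read as a greedy forward selection: at each step, add the candidate that leaves the smallest conditional entropy given everything chosen so far. The sketch below implements only that reading; the lower-order approximate searches are not reproduced. The function names, the candidate labels, and the toy data (a target equal to the AND of `a` and `b`, with `c` a copy of `a` and `d` unrelated) are assumptions for illustration.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(samples):
    """Plug-in Shannon entropy (bits) of a sequence of hashable states."""
    counts = Counter(samples)
    n = len(samples)
    return -sum(c / n * log2(c / n) for c in counts.values())


def cond_entropy(x, ys):
    """S(X | ys) for a list of sequences ys (empty list gives S(X))."""
    if not ys:
        return entropy(x)
    return entropy(list(zip(x, *ys))) - entropy(list(zip(*ys)))


def greedy_search(x, candidates, max_k):
    """Greedily pick up to max_k candidates, minimizing the remaining entropy of x.

    candidates maps a label to its state sequence. Returns the chosen labels and
    the remaining entropy after 0, 1, 2, ... measurements.
    """
    chosen = []
    remaining = dict(candidates)
    curve = [entropy(x)]
    while remaining and len(chosen) < max_k:
        measured = [candidates[name] for name in chosen]
        best = min(remaining, key=lambda name: cond_entropy(x, measured + [remaining[name]]))
        chosen.append(best)
        del remaining[best]
        curve.append(cond_entropy(x, [candidates[name] for name in chosen]))
    return chosen, curve


# Toy data: all combinations of three independent binary variables a, b, d.
rows = list(product([0, 1], repeat=3))
a = [r[0] for r in rows]
b = [r[1] for r in rows]
d = [r[2] for r in rows]
candidates = {"a": a, "b": b, "c": list(a), "d": d}
x = [u & v for u, v in zip(a, b)]

chosen, curve = greedy_search(x, candidates, max_k=2)
print(chosen, curve)
```

Each step costs one conditional-entropy estimate per remaining candidate, so selecting k variables out of n takes on the order of n times k evaluations rather than one per k-subset. In the toy data the redundant copy `c` and the unrelated `d` are passed over in the second step because they add nothing once `a` is known.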

Fig. 2 (inset) shows the histogram of the ratio of final to
initial entropy for all 62 neurons. Final entropy refers to 
the fraction of a neuron's initial entropy left unaccounted 
for once the set of all other available neurons is measured. 






[Figure 3. Panel (a) axis label: Neurons measured ($k$). Panel (b) axis label: Order ($k$).]

FIG. 3: (a) Sorted global information deficit/excess of a multiplet,
relative to the sum of the pairwise mutual informations
(the coefficient of redundancy $R$). (b) Values of each term in the expansion in Eq. (3) vs.
$k$ for 36,000 randomly sampled variable combinations. White
to blue: 0 - 0.5%; red to yellow: 0.5 - 100%.



Remarkably, the stochastic patterns of most cells can be 
nearly fully predicted by the activity of others, even if 
most degrees of freedom in the actual network remain 
unobserved (we estimate that only about 5-10% of all
neurons are measured). To better understand the informational
nature of arrangements of neurons we show in
Fig. 3(a) the coefficient of redundancy $R$ for each of the measured cells in the network.
By this measure most cell groups are globally redundant 
(red) relative to their decomposition in terms of purely 
binary correlations to other cells. About a third of the 
cells, though, show substantial synergy (blue) that persists
despite many sequential measurements. Fig. 3(b)
shows the distribution of each term in the expansion in
Eq. (3) to order $k$. We include all multiplets up to order
$k = 2$, and thereafter use a random sample of 36,000
multiplets. Recall that the value and sign of each term 
in the expansion indicates redundancy or synergy rela- 
tive to the sum of all submultiplets of lower order. Glob- 
ally redundant multiplets often result in terms with al- 
ternating signs to lower orders, while a smaller number 
of multiplets corresponding to synergetic arrangements 
have negative contributions at every order. 

Fig. 4(a) shows the frequency of synergetic and redundant
cores, while Fig. 4(b) shows the reconstruction
of circuits from functional subgraphs which account for
the activity of target neuron 46 of Fig. 2. Evidently the
target neuron is part of both redundant and synergetic 
functional multiplets, with the former being substantially 
more abundant. The most informative neuron is labeled 
42, but its information about the target is shared to a 
large extent with neurons 14 and 19. The target neuron 
is also part of a synergetic circuit with other neurons, 
several of which are part of smaller mutually redundant 
subgraphs. Some of these can, at least partially, be in- 
terchanged with other neurons carrying the same infor- 
mation, resulting globally in an interconnected ensemble 
where specific synergetic functional relationships are em- 
bedded on robust redundant cell arrangements. 

In summary, we present a new information theoretic 
FIG. 4: (a) Frequency of redundant (red) and synergetic
(blue) cores versus size $k$. (b) Purely redundant (red) and
purely synergetic (blue) circuits relative to neuron 46. Neurons
and groups with the most information about 46 are closest
to the center; cf. Fig. 2. Arcs identify neurons that participate
in multiple functional groups.



approach to constructing functional subgraphs in com- 
plex networks where nodes display observable stochastic 
dynamics. By performing targeted searches guided by 
expected information gain from new measurements, we 
avoid some of the combinatorial issues usually involved in 
the search for motifs in complex networks. We apply this 
approach to action potential time series from networks of 
neurons and find that the activity of most neurons is to a 
large extent determined by the observation of other cells 
in the network. This finding is remarkable because only 
a small portion (5-10%) of cells is accessible to measurement,
indicating that large amounts of redundancy
characterize neural network dynamics in these cultures. 
Although the activity of many neurons can be substan- 
tially accounted for by a relatively small number of other 
cells, an important fraction of a neuron's entropy and 
detailed firing patterns is contained in multiple cell ar- 
rangements of varying size. These findings agree well 
with recent neuronal network reconstructions in terms of 
binary correlations ^29| and small multiplets f?! , but also 
provide a new view of the contribution of higher order 
functional correlations. The identification of functional 
connectivity subgraphs in living neuronal cultures is crit- 
ical for designing future experiments that promote com- 
putational tasks within neural networks, and should find 
applications more generally in other complex systems. 